symmetric function
Symmetric Aggregation of Conformity Scores for Efficient Uncertainty Sets
Alami, Nabil, Zakharia, Jad, Taieb, Souhaib Ben
Access to multiple predictive models trained for the same task, whether in regression or classification, is increasingly common in many applications. Aggregating their predictive uncertainties to produce reliable and efficient uncertainty quantification is therefore a critical but still underexplored challenge, especially within the framework of conformal prediction (CP). While CP methods can generate individual prediction sets from each model, combining them into a single, more informative set remains a challenging problem. To address this, we propose SACP (Symmetric Aggregated Con-formal Prediction), a novel method that aggregates nonconformity scores from multiple predictors. SACP transforms these scores into e-values and combines them using any symmetric aggregation function. This flexible design enables a robust, data-driven framework for selecting aggregation strategies that yield sharper prediction sets. We also provide theoretical insights that help justify the validity and performance of the SACP approach. Extensive experiments on diverse datasets show that SACP consistently improves efficiency and often outperforms state-of-the-art model aggregation baselines.
- North America > Canada > Ontario > Toronto (0.14)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Research Report > New Finding (0.67)
- Research Report > Promising Solution (0.54)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
A universal compression theory: Lottery ticket hypothesis and superpolynomial scaling laws
Wang, Hong-Yi, Luo, Di, Poggio, Tomaso, Chuang, Isaac L., Ziyin, Liu
When training large-scale models, the performance typically scales with the number of parameters and the dataset size according to a slow power law. A fundamental theoretical and practical question is whether comparable performance can be achieved with significantly smaller models and substantially less data. In this work, we provide a positive and constructive answer. We prove that a generic permutation-invariant function of $d$ objects can be asymptotically compressed into a function of $\operatorname{polylog} d$ objects with vanishing error. This theorem yields two key implications: (Ia) a large neural network can be compressed to polylogarithmic width while preserving its learning dynamics; (Ib) a large dataset can be compressed to polylogarithmic size while leaving the loss landscape of the corresponding model unchanged. (Ia) directly establishes a proof of the \textit{dynamical} lottery ticket hypothesis, which states that any ordinary network can be strongly compressed such that the learning dynamics and result remain unchanged. (Ib) shows that a neural scaling law of the form $L\sim d^{-α}$ can be boosted to an arbitrarily fast power law decay, and ultimately to $\exp(-α' \sqrt[m]{d})$.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Netherlands (0.04)
- Research Report (0.64)
- Contests & Prizes (0.61)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
A Representation Theory for Ranking Functions
Harsh H. Pareek, Pradeep K. Ravikumar
This paper presents a representation theory for permutation-valued functions, which in their general form can also be called listwise ranking functions. Pointwise ranking functions assign a score to each object independently, without taking into account the other objects under consideration; whereas listwise loss functions evaluate the set of scores assigned to all objects as a whole. In many supervised learning to rank tasks, it might be of interest to use listwise ranking functions instead; in particular, the Bayes Optimal ranking functions might themselves be listwise, especially if the loss function is listwise. A key caveat to using listwise ranking functions has been the lack of an appropriate representation theory for such functions. We show that a natural symmetricity assumption that we call exchangeability allows us to explicitly characterize the set of such exchangeable listwise ranking functions. Our analysis draws from the theories of tensor analysis, functional analysis and De Finetti theorems. We also present experiments using a novel reranking method motivated by our representation theory.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Africa > Senegal > Kolda Region > Kolda (0.04)
Cirbo: A New Tool for Boolean Circuit Analysis and Synthesis
Averkov, Daniil, Belova, Tatiana, Emdin, Gregory, Goncharov, Mikhail, Krivogornitsyna, Viktoriia, Kulikov, Alexander S., Kurmazov, Fedor, Levtsov, Daniil, Levtsov, Georgie, Vaskin, Vsevolod, Vorobiev, Aleksey
We present an open-source tool for manipulating Boolean circuits. It implements efficient algorithms, both existing and novel, for a rich variety of frequently used circuit tasks such as satisfiability, synthesis, and minimization. We tested the tool on a wide range of practically relevant circuits (computing, in particular, symmetric and arithmetic functions) that have been optimized intensively by the community for the last three years. The tool helped us to win the IWLS 2024 Programming Contest. In 2023, it was Google DeepMind who took the first place in the competition. We were able to reduce the size of the best circuits from 2023 by 12\% on average, whereas for some individual circuits, our size reduction was as large as 83\%.
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Europe > Austria > Upper Austria > Linz (0.04)
Reinforcement Learning the Chromatic Symmetric Function
Bérczi, Gergely, Klüver, Jonas
We propose a conjectural counting formula for the coefficients of the chromatic symmetric function of unit interval graphs using reinforcement learning. The formula counts specific disjoint cycle-tuples in the graphs, referred to as Eschers, which satisfy certain concatenation conditions. These conditions are identified by a reinforcement learning model and are independent of the particular unit interval graph, resulting a universal counting expression.
On the Representational Efficiency of Restricted Boltzmann Machines James Martens Richard Zemel Department of Computer Science
This paper examines the question: What kinds of distributions can be efficiently represented by Restricted Boltzmann Machines (RBMs)? We characterize the RBM's unnormalized log-likelihood function as a type of neural network, and through a series of simulation results relate these networks to ones whose representational properties are better understood. We show the surprising result that RBMs can efficiently capture any distribution whose density depends on the number of 1's in their input. We also provide the first known example of a particular type of distribution that provably cannot be efficiently represented by an RBM, assuming a realistic exponential upper bound on the weights. By formally demonstrating that a relatively simple distribution cannot be represented efficiently by an RBM our results provide a new rigorous justification for the use of potentially more expressive generative models, such as deeper ones.
A Representation Theory for Ranking Functions
This paper presents a representation theory for permutation-valued functions, which in their general form can also be called listwise ranking functions. Pointwise ranking functions assign a score to each object independently, without taking into account the other objects under consideration; whereas listwise loss functions evaluate the set of scores assigned to all objects as a whole. In many supervised learning to rank tasks, it might be of interest to use listwise ranking functions instead; in particular, the Bayes Optimal ranking functions might themselves be listwise, especially if the loss function is listwise. A key caveat to using listwise ranking functions has been the lack of an appropriate representation theory for such functions. We show that a natural symmetricity assumption that we call exchangeability allows us to explicitly characterize the set of such exchangeable listwise ranking functions. Our analysis draws from the theories of tensor analysis, functional analysis and De Finetti theorems. We also present experiments using a novel reranking method motivated by our representation theory.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Africa > Senegal > Kolda Region > Kolda (0.04)
Uniform $\mathcal{C}^k$ Approximation of $G$-Invariant and Antisymmetric Functions, Embedding Dimensions, and Polynomial Representations
Ganguly, Soumya, Tran, Khoa, Sarkar, Rahul
For any subgroup $G$ of the symmetric group $\mathcal{S}_n$ on $n$ symbols, we present results for the uniform $\mathcal{C}^k$ approximation of $G$-invariant functions by $G$-invariant polynomials. For the case of totally symmetric functions ($G = \mathcal{S}_n$), we show that this gives rise to the sum-decomposition Deep Sets ansatz of Zaheer et al. (2018), where both the inner and outer functions can be chosen to be smooth, and moreover, the inner function can be chosen to be independent of the target function being approximated. In particular, we show that the embedding dimension required is independent of the regularity of the target function, the accuracy of the desired approximation, as well as $k$. Next, we show that a similar procedure allows us to obtain a uniform $\mathcal{C}^k$ approximation of antisymmetric functions as a sum of $K$ terms, where each term is a product of a smooth totally symmetric function and a smooth antisymmetric homogeneous polynomial of degree at most $\binom{n}{2}$. We also provide upper and lower bounds on $K$ and show that $K$ is independent of the regularity of the target function, the desired approximation accuracy, and $k$.
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (8 more...)